Robustness Testing of Intermediate Verifiers
Program verifiers are not exempt from the bugs that affect nearly every piece
of software. In addition, they often exhibit brittle behavior: their
performance changes considerably with how the input program is
expressed, including details that should be irrelevant, such as the order of
independent declarations. Such a lack of robustness frustrates users, who have to spend
considerable time figuring out a tool's idiosyncrasies before they can use it
effectively.
This paper introduces a technique to detect lack of robustness of program
verifiers; the technique is lightweight and fully automated, as it is based on
testing methods (such as mutation testing and metamorphic testing). The key
idea is to generate many simple variants of a program that initially passes
verification. All variants are, by construction, equivalent to the original
program; thus, any variant that fails verification indicates lack of robustness
in the verifier.
We implemented our technique in a tool called "mugie", which operates on
programs written in the popular Boogie verification language, used as an
intermediate representation in numerous program verifiers. Experiments
targeting 135 Boogie programs indicate that brittle behavior occurs fairly
frequently (16 programs) and is not hard to trigger. Based on these results,
the paper discusses the main sources of brittle behavior and suggests means of
improving robustness.
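The core idea of the abstract above can be sketched in a few lines: generate equivalence-preserving variants of a program (here, by permuting its independent top-level declarations) and report any variant that the verifier rejects. This is a simplified illustration of the mutation idea, not mugie itself; `run_verifier` is a hypothetical stand-in for invoking an actual Boogie verifier.

```python
import random

def equivalent_variants(declarations, n=10, seed=0):
    """Generate semantics-preserving variants of a program given as a
    list of independent top-level declarations, by shuffling their order.
    All variants are equivalent to the original by construction."""
    rng = random.Random(seed)
    variants = []
    for _ in range(n):
        v = declarations[:]
        rng.shuffle(v)
        variants.append(v)
    return variants

def robustness_report(declarations, run_verifier, n=10):
    """Return the variants that fail verification even though the
    original passes: each one witnesses brittle verifier behavior.
    `run_verifier` is a hypothetical callable (program -> bool)."""
    assert run_verifier(declarations), "original program must verify"
    return [v for v in equivalent_variants(declarations, n)
            if not run_verifier(v)]
```

A robust verifier yields an empty report for every input; a verifier whose outcome depends on declaration order is flagged immediately.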
A formally verified compiler back-end
This article describes the development and formal verification (proof of
semantic preservation) of a compiler back-end from Cminor (a simple imperative
intermediate language) to PowerPC assembly code, using the Coq proof assistant
both for programming the compiler and for proving its correctness. Such a
verified compiler is useful in the context of formal methods applied to the
certification of critical software: the verification of the compiler guarantees
that the safety properties proved on the source code hold for the executable
compiled code as well.
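The guarantee described above can be stated in a simplified form (CompCert's actual theorems are phrased over observable behaviors and several intermediate languages; this is a one-line sketch, not the formal statement):

\[
\mathrm{Compile}(S) = \mathsf{OK}(C) \;\wedge\; S \Downarrow B \;\Longrightarrow\; C \Downarrow B
\]

so any safety property established for every behavior $B$ of the source program $S$ carries over to the compiled code $C$.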
Advances in Property-Based Testing for αProlog
αCheck is a light-weight property-based testing tool built on top of
αProlog, a logic programming language based on nominal logic.
αProlog is particularly suited to the validation of the meta-theory of
formal systems, for example correctness of compiler translations involving
name-binding, alpha-equivalence and capture-avoiding substitution. In this
paper we describe an alternative to the negation elimination algorithm
underlying αCheck that substantially improves its effectiveness. To
substantiate this claim we compare the checker's performance w.r.t. two of its
main competitors in the logical framework niche, namely the QuickCheck/Nitpick
combination offered by Isabelle/HOL and the random testing facility in
PLT-Redex. (Comment: To appear, Tests and Proofs 2016; includes an appendix
with details not in the conference version.)
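The kind of meta-theory property mentioned above can be illustrated with a small property-based test: randomly generate lambda terms and check that a capture-avoiding substitution satisfies the standard free-variable lemma. This is an illustrative sketch in Python, not αCheck's nominal-logic machinery; the term encoding and helper names are assumptions made for the example.

```python
import random

# Lambda terms as tuples: ("var", x), ("lam", x, body), ("app", f, a).

def fv(t):
    """Free variables of a term."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return fv(t[2]) - {t[1]}
    return fv(t[1]) | fv(t[2])

def fresh(avoid):
    """A variable name not occurring in `avoid`."""
    i = 0
    while f"v{i}" in avoid:
        i += 1
    return f"v{i}"

def subst(t, x, s):
    """Capture-avoiding substitution t[x := s]."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    y, body = t[1], t[2]
    if y == x:
        return t  # x is shadowed; no free occurrences below
    if y in fv(s):  # rename the binder to avoid capturing fv(s)
        z = fresh(fv(body) | fv(s) | {x})
        body, y = subst(body, y, ("var", z)), z
    return ("lam", y, subst(body, x, s))

def random_term(depth, names=("a", "b", "c")):
    """Random generator for small terms, QuickCheck-style."""
    if depth == 0 or random.random() < 0.3:
        return ("var", random.choice(names))
    if random.random() < 0.5:
        return ("lam", random.choice(names), random_term(depth - 1, names))
    return ("app", random_term(depth - 1, names), random_term(depth - 1, names))

def check_subst_property(trials=500):
    """Property: fv(t[x := s]) is a subset of (fv(t) - {x}) | fv(s)."""
    random.seed(0)
    for _ in range(trials):
        t, s = random_term(3), random_term(2)
        x = random.choice(("a", "b", "c"))
        assert fv(subst(t, x, s)) <= (fv(t) - {x}) | fv(s), (t, x, s)
    return True
```

A buggy (capturing) substitution would be caught by the same property with a concrete counterexample term, which is exactly the workflow such checkers automate.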
Leveraging metamorphic testing to automatically detect inconsistencies in code generator families
Generative software development has paved the way for the creation of multiple code generators that serve as a basis for automatically generating code for different software and hardware platforms. In this context, software quality becomes highly correlated with the quality of the code generators used during software development. Potential failures may result in a loss of confidence for developers, who are unlikely to continue using these generators. It is therefore crucial to verify the correct behaviour of code generators in order to preserve software quality and reliability. In this paper, we leverage the metamorphic testing approach to automatically detect inconsistencies in code generators via so-called "metamorphic relations". We define the metamorphic relation (i.e., test oracle) as a comparison between the variations of performance and resource usage of test suites running on different versions of generated code. We rely on statistical methods to find the threshold value from which an unexpected variation is detected. We evaluate our approach by testing a family of code generators with respect to resource usage and performance metrics for five different target software platforms. The experimental results show that our approach is able to detect, among 95 executed test suites, 11 performance and 15 memory usage inconsistencies.
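The metamorphic oracle described above can be sketched as a comparison of per-suite measurements across two generator versions against a statistically derived threshold. The function below is an illustrative simplification under assumed inputs (dictionaries of measured times and a history of "normal" relative variations); the paper's actual statistical method and data formats are not specified here.

```python
import statistics

def detect_inconsistencies(baseline, candidate, history, k=3.0):
    """Metamorphic-relation oracle sketch: flag any test suite whose
    relative variation between two code-generator versions exceeds a
    threshold derived from historical variations (mean + k * stdev).
    baseline/candidate: {suite_name: measured value, e.g. seconds};
    history: past relative variations considered normal behavior."""
    threshold = statistics.mean(history) + k * statistics.stdev(history)
    flagged = []
    for suite, base in baseline.items():
        variation = abs(candidate[suite] - base) / base
        if variation > threshold:
            flagged.append((suite, variation))
    return flagged
```

The same comparison works for memory-usage metrics; only the measurement source changes, not the oracle.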
A Benchmark Generator for Online First-Order Monitoring
We present a randomized benchmark generator for attesting the correctness and performance of online first-order monitors. The benchmark generator consists of three components: a stream generator, a stream replayer, and a monitoring oracle. The stream generator produces random event streams that conform to user-defined characteristics such as event frequencies and distributions of the events' parameters. The stream replayer reproduces event streams in real time at a user-defined velocity. By varying the stream characteristics and velocity, one can analyze their impact on the monitor's performance. The monitoring oracle provides the expected result of monitoring the generated streams against metric first-order regular specifications. The specification languages supported by most existing monitors are either a subset of or share a large common fragment with the oracle's language. Thus, we envision that our benchmark generator will be used as a standard correctness and performance testing tool for online monitors.
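The stream-generator component described above can be sketched as a Poisson process per event type: the user supplies expected per-second frequencies, and the generator emits timestamped events at matching rates. The interface and names below are illustrative assumptions, not the tool's actual API.

```python
import random

def generate_stream(frequencies, duration, seed=0):
    """Random event-stream generator sketch.
    frequencies: {event_name: expected events per second};
    duration: stream length in seconds.
    Emits (timestamp, event_name, parameter) tuples sorted by time,
    with inter-arrival times drawn from an exponential distribution
    so each event type occurs at roughly its requested frequency."""
    rng = random.Random(seed)
    events = []
    for name, rate in frequencies.items():
        t = rng.expovariate(rate)
        while t < duration:
            events.append((round(t, 3), name, rng.randint(0, 100)))
            t += rng.expovariate(rate)
    return sorted(events)
```

A replayer would then walk this list, sleeping between timestamps (scaled by the chosen velocity), while the oracle is computed offline from the same list.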